Issues in evaluating semantic spaces using word analogies

نویسنده

  • Tal Linzen
چکیده

The offset method for solving word analogies has become a standard evaluation tool for vector-space semantic models: it is considered desirable for a space to represent semantic relations as consistent vector offsets. We show that the method’s reliance on cosine similarity conflates offset consistency with largely irrelevant neighborhood structure, and propose simple baselines that should be used to improve the utility of the method in vector space evaluation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Wor...

متن کامل

Psychometric Properties of the Persian Word Pairs Task for Declarative Memory Assessment

Objective: According to the declarative/procedural model, the semantic aspect of language depends on the brain structures responsible for declarative memory. The word pairs task is a common tool for evaluating declarative memory. The current study aimed to design a valid and reliable task for evaluating declarative memory in Persian children at learning and retention stages and to investigate i...

متن کامل

Word Embeddings as Metric Recovery in Semantic Spaces

Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are inde...

متن کامل

Evaluating vector-space models of analogy

Vector-space representations provide geometric tools for reasoning about the similarity of a set of objects and their relationships. Recent machine learning methods for deriving vectorspace embeddings of words (e.g., word2vec) have achieved considerable success in natural language processing. These vector spaces have also been shown to exhibit a surprising capacity to capture verbal analogies, ...

متن کامل

Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings

This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies (“word w1 is to word w2 as word w3 is to word w4”) through vector addition. Here, I show that temporal word analogies (“wordw1 at time tα is ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016